binomial distribution
854a9ab0f323b841955e70ca383b27d1-Supplemental-Conference.pdf
To be specific, for every node in the $i$-th class, we use a binomial distribution with mean $p_{in} = h_{in}/h$ to generate an $h$-dimensional binary vector as its $((i-1)h+1)$-th to $(ih)$-th attributes, and generate the remaining attributes using a binomial distribution with mean $p_{out} = h_{out}/(3h)$. In our experiments, we set $4h = 200$ and $h_{out} = 4$ ($h_{in} + h_{out} = 16$), so that $p_{in} > p_{out}$: the $h$-dimensional attributes are associated with the $i$-th class with a higher probability, whereas the remaining $3h$ attributes are irrelevant. Finally, we show the influence of the number of data augmentations in Figure 6 (d). With the increase of $S$, the node classification performance improves steadily until it stabilizes. Local variation algorithms differ only in the type of contraction sets that they consider: Variation Edges only contracts edges, whereas contraction sets in Variation Neighborhoods are subsets of nodes' neighborhoods. Then, ANS-GT combines the weighted ... Table 6: Efficiency comparisons with Graph Transformer baselines.
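The attribute generation described in this excerpt can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes four classes (implied by the 4h and 3h counts), treats each attribute as a single-trial binomial (Bernoulli) draw, and derives h = 50 and h_in = 12 from the stated 4h = 200, h_out = 4 and h_in + h_out = 16. The function name `generate_attributes` and the `seed` parameter are hypothetical.

```python
import numpy as np

def generate_attributes(labels, num_classes=4, h=50, h_in=12, h_out=4, seed=0):
    """Sample binary node attributes: the block of class i uses p_in, the rest p_out."""
    rng = np.random.default_rng(seed)
    p_in = h_in / h                          # 12 / 50 = 0.24
    p_out = h_out / ((num_classes - 1) * h)  # 4 / 150, about 0.027
    labels = np.asarray(labels)
    n = labels.shape[0]
    # Start from the "irrelevant" probability for every attribute.
    probs = np.full((n, num_classes * h), p_out)
    for i in range(num_classes):
        rows = labels == i
        # Attributes i*h .. (i+1)*h - 1 (0-based) are tied to class i.
        probs[rows, i * h:(i + 1) * h] = p_in
    # One trial per entry, so each attribute is an independent Bernoulli draw.
    return rng.binomial(1, probs)

labels = [0, 1, 2, 3] * 500
X = generate_attributes(labels)
print(X.shape)  # (2000, 200)
```

Under these assumptions a node has on average h_in = 12 active attributes inside its own class block and h_out = 4 spread over the other 3h attributes.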
Stochastic Predictive Analytics for Stocks in the Newsvendor Problem
The Newsvendor problem is a fundamental model in inventory management (Rossi, 2021) that accommodates both known (Dvoretzky et al., 1952a) and unknown (Dvoretzky et al., 1952b) demand distributions. Since its inception (Edgeworth, 1888), it has been widely applied in inventory control and policy-making (Arrow et al., 1951), as well as various real-world situations (Choi, 2012; Chen et al., 2016). Its simplicity stems from considering a single product for sale, for which the optimal initial stock level must be determined to satisfy forecasted demand over a given period without restocking. The interplay among purchasing cost, selling price, and stock ordered at the beginning of the period determines the inventory management policies (Whitin, 1952; Rosenblatt, 1954; Petruzzi and Dada, 1999). The model has been extensively studied for single stock-keeping units (SKUs). Electronic marketplaces introduce an extra complication to the problem, as they need to manage a large number of SKUs at distribution centers alongside highly variable demand received through electronic platforms.
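To make the interplay among cost, price and initial stock concrete, here is a small sketch of the classical single-SKU newsvendor rule with a known discrete demand distribution. It is the textbook critical-ratio solution, not the method of the paper above, and it assumes no salvage value and no shortage penalty. The function names and the binomial demand in the usage lines are illustrative assumptions.

```python
import math

def newsvendor_order_quantity(demand_pmf, cost, price):
    """Smallest q whose cumulative demand probability reaches (price - cost) / price."""
    if not 0 < cost < price:
        raise ValueError("expected 0 < cost < price")
    critical_ratio = (price - cost) / price
    cumulative = 0.0
    for q, prob in enumerate(demand_pmf):
        cumulative += prob
        if cumulative >= critical_ratio - 1e-12:
            return q
    return len(demand_pmf) - 1

def expected_profit(demand_pmf, q, cost, price):
    """Expected profit when q units are ordered and unsold units are worthless."""
    return sum(prob * (price * min(q, d) - cost * q)
               for d, prob in enumerate(demand_pmf))

# Illustrative demand: Binomial(20, 0.5), i.e. 20 potential buyers.
n, p = 20, 0.5
pmf = [math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
q_star = newsvendor_order_quantity(pmf, cost=4.0, price=10.0)
print(q_star, expected_profit(pmf, q_star, cost=4.0, price=10.0))
```

The rule follows from the marginal argument: ordering one more unit pays off while the probability of selling it, 1 - F(q), exceeds cost / price.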
Global Optimization of Stochastic Black-Box Functions with Arbitrary Noise Distributions using Wilson Score Kernel Density Estimation
Iversen, Thorbjørn Mosekjær, Sørensen, Lars Carøe, Mathiesen, Simon Faarvang, Petersen, Henrik Gordon
Many optimization problems in robotics involve the optimization of time-expensive black-box functions, such as those involving complex simulations or evaluation of real-world experiments. Furthermore, these functions are often stochastic as repeated experiments are subject to unmeasurable disturbances. Bayesian optimization can be used to optimize such methods in an efficient manner by deploying a probabilistic function estimator to estimate with a given confidence so that regions of the search space can be pruned away. Consequently, the success of the Bayesian optimization depends on the function estimator's ability to provide informative confidence bounds. Existing function estimators require many function evaluations to infer the underlying confidence or depend on modeling of the disturbances. In this paper, it is shown that the confidence bounds provided by the Wilson Score Kernel Density Estimator (WS-KDE) are applicable as excellent bounds to any stochastic function with an output confined to the closed interval [0;1] regardless of the distribution of the output. This finding opens up the use of WS-KDE for stable global optimization on a wider range of cost functions. The properties of WS-KDE in the context of Bayesian optimization are demonstrated in simulation and applied to the problem of automated trap design for vibrational part feeders.
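For readers unfamiliar with the Wilson score, the sketch below computes the standard Wilson score interval for a mean of n observations in [0, 1]. It is only the plain interval, under the assumption of equally weighted samples; the kernel weighting that makes up WS-KDE is not reproduced here. The function name and the default z = 1.96 (roughly 95% confidence) are illustrative choices.

```python
import math

def wilson_score_interval(mean, n, z=1.96):
    """Wilson score interval for an observed mean in [0, 1] over n samples."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= mean <= 1.0:
        raise ValueError("mean must lie in [0, 1]")
    denom = 1.0 + z * z / n
    center = (mean + z * z / (2.0 * n)) / denom
    half_width = z * math.sqrt(mean * (1.0 - mean) / n + z * z / (4.0 * n * n)) / denom
    return center - half_width, center + half_width

low, high = wilson_score_interval(0.5, 100)
print(round(low, 4), round(high, 4))
```

Unlike the normal-approximation interval, the bounds stay inside [0, 1] even when the observed mean is 0 or 1, which is what makes them usable for pruning regions of a search space.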
Thompson Sampling-like Algorithms for Stochastic Rising Bandits
Fiandri, Marco, Metelli, Alberto Maria, Trovò, Francesco
Stochastic rising rested bandit (SRRB) is a setting where the arms' expected rewards increase as they are pulled. It models scenarios in which the performances of the different options grow as an effect of an underlying learning process (e.g., online model selection). Even if the bandit literature provides specifically crafted algorithms based on upper-confidence bounds for such a setting, no study about Thompson sampling (TS)-like algorithms has been performed so far. The strong regularity of the expected rewards in the SRRB setting suggests that specific instances may be tackled effectively using adapted and sliding-window TS approaches. This work provides novel regret analyses for such algorithms in SRRBs, highlighting the challenges and providing new technical tools of independent interest. Our results allow us to identify under which assumptions TS-like algorithms succeed in achieving sublinear regret and which properties of the environment govern the complexity of the regret minimization problem when approached with TS. Furthermore, we provide a regret lower bound based on a complexity index we introduce. Finally, we conduct numerical simulations comparing TS-like algorithms with state-of-the-art approaches for SRRBs in synthetic and real-world settings.
Acceptance or Rejection of Lots while Minimizing and Controlling Type I and Type II Errors
Ursini, Edson Luiz, Poletti, Elaine Cristina Catapani, da Silveira, Loreno Menezes, Leite, José Roberto Emiliano
The double hypothesis test (DHT) is a test that allows controlling Type I (producer) and Type II (consumer) errors. It is possible to say whether the batch has a defect rate, p, between 1.5 and 2%, or between 2 and 5%, or between 5 and 10%, and so on, until finding a required value for this probability. Using the two probabilities side by side, the Type I error for the lower probability distribution and the Type II error for the higher probability distribution, both can be controlled and minimized. It can be applied in the development or manufacturing process of a batch of components, or in the case of purchasing from a supplier, when the percentage of defects (p) is unknown, considering the technology and/or process available to obtain them. The power of the test is amplified by the joint application of the Limit of Successive Failures (LSF) related to the Renewal Theory. To enable the choice of the most appropriate algorithm for each application, four distributions are proposed for the Bernoulli event sequence, including their computational efforts: Binomial, Binomial approximated by Poisson, and Binomial approximated by Gaussian (with two variants). Fuzzy logic rules are also applied to facilitate decision-making.
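The two error types can be illustrated with a standard single sampling plan: inspect n items and accept the lot if at most c are defective. This is a generic textbook sketch, not the DHT or LSF procedure of the paper. The plan (n = 200, c = 6) and the rates 2% and 5% are illustrative assumptions, and all function names are hypothetical.

```python
import math

def binomial_cdf(c, n, p):
    """P(X <= c) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

def poisson_cdf(c, lam):
    """P(X <= c) for X ~ Poisson(lam), used as an approximation with lam = n * p."""
    return sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(c + 1))

def plan_errors(n, c, p_low, p_high):
    """Type I and Type II error of the plan 'accept if defects <= c'."""
    type_1 = 1.0 - binomial_cdf(c, n, p_low)  # reject a lot at the lower defect rate
    type_2 = binomial_cdf(c, n, p_high)       # accept a lot at the higher defect rate
    return type_1, type_2

type_1, type_2 = plan_errors(n=200, c=6, p_low=0.02, p_high=0.05)
print(f"Type I (producer): {type_1:.4f}, Type II (consumer): {type_2:.4f}")
print(f"Poisson approximation of acceptance at 2%: {poisson_cdf(6, 200 * 0.02):.4f}")
```

Raising c lowers the Type I error and raises the Type II error, so in this simple plan both can only be reduced together by increasing n.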
Improving Value-based Process Verifier via Structural Prior Injection
Sun, Zetian, Li, Dongfang, Hu, Baotian, Yu, Jun, Zhang, Min
In the Large Language Model (LLM) reasoning scenario, people often estimate state value via Monte Carlo sampling. Though Monte Carlo estimation is an elegant method with less inductive bias, noise and errors are inevitably introduced due to the limited sampling. To handle the problem, we inject the structural prior into the value representation and transform the scalar value into the expectation of a pre-defined categorical distribution, representing the noise and errors from a distribution perspective. Specifically, by treating the result of Monte Carlo sampling as a single sample from the prior ground-truth Binomial distribution, we quantify the sampling error as the mismatch between posterior estimated distribution and ground-truth distribution, which is thus optimized via distribution selection optimization. We test the performance of value-based process verifiers on Best-of-N task and Beam search task. Compared with the scalar value representation, we show that reasonable structural prior injection induced by different objective functions or optimization methods can improve the performance of value-based process verifiers by about 1$\sim$2 points at little-to-no cost. We also show that under different structural priors, the verifiers' performances vary greatly despite having the same optimal solution, indicating the importance of reasonable structural prior injection.
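The binomial view of a Monte Carlo value estimate can be sketched in a few lines. If a state has true value v and n rollouts are sampled, the number of successful rollouts k follows Binomial(n, v), so the estimate k/n is one draw from a categorical distribution over {0/n, 1/n, ..., n/n} whose expectation is v. The code below only illustrates that relationship; it is not the paper's training objective, and the function names are hypothetical.

```python
import math

def binomial_value_distribution(value, num_rollouts):
    """Probability of observing k successes (k = 0..n) when the true value is `value`."""
    n = num_rollouts
    return [math.comb(n, k) * value**k * (1 - value)**(n - k) for k in range(n + 1)]

def expected_value(distribution):
    """Expectation of the categorical distribution over the bins k / n."""
    n = len(distribution) - 1
    return sum(prob * k / n for k, prob in enumerate(distribution))

dist = binomial_value_distribution(value=0.3, num_rollouts=8)
print([round(prob, 3) for prob in dist])
print(expected_value(dist))  # approximately 0.3, up to floating-point rounding
```

With only 8 rollouts the probability mass is spread over several bins, which is the sampling noise the abstract refers to.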
No-regret incentive-compatible online learning under exact truthfulness with non-myopic experts
Komiyama, Junpei, Mehta, Nishant A., Mortazavi, Ali
We study an online forecasting setting in which, over $T$ rounds, $N$ strategic experts each report a forecast to a mechanism, the mechanism selects one forecast, and then the outcome is revealed. In any given round, each expert has a belief about the outcome, but the expert wishes to select its report so as to maximize the total number of times it is selected. The goal of the mechanism is to obtain low belief regret: the difference between its cumulative loss (based on its selected forecasts) and the cumulative loss of the best expert in hindsight (as measured by the experts' beliefs). We consider exactly truthful mechanisms for non-myopic experts, meaning that truthfully reporting its belief strictly maximizes the expert's subjective probability of being selected in any future round. Even in the full-information setting, it is an open problem to obtain the first no-regret exactly truthful mechanism in this setting. We develop the first no-regret mechanism for this setting via an online extension of the Independent-Event Lotteries Forecasting Competition Mechanism (I-ELF). By viewing this online I-ELF as a novel instance of Follow the Perturbed Leader (FPL) with noise based on random walks with loss-dependent perturbations, we obtain $\tilde{O}(\sqrt{T N})$ regret. Our results are fueled by new tail bounds for Poisson binomial random variables that we develop. We extend our results to the bandit setting, where we give an exactly truthful mechanism obtaining $\tilde{O}(T^{2/3} N^{1/3})$ regret; this is the first no-regret result even among approximately truthful mechanisms.
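As background for the Poisson binomial random variables mentioned above: a Poisson binomial variable is the number of successes among independent Bernoulli trials with possibly different success probabilities. Its exact distribution can be computed with a simple dynamic program, sketched below. This is a generic illustration and not part of the mechanism or the tail bounds in the paper; the function name is hypothetical.

```python
def poisson_binomial_pmf(probs):
    """Exact pmf of the number of successes among independent Bernoulli(p_i) trials."""
    pmf = [1.0]  # with zero trials, zero successes has probability 1
    for p in probs:
        updated = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            updated[k] += mass * (1.0 - p)  # this trial fails
            updated[k + 1] += mass * p      # this trial succeeds
        pmf = updated
    return pmf

pmf = poisson_binomial_pmf([0.1, 0.5, 0.9])
mean = sum(k * prob for k, prob in enumerate(pmf))
print(pmf, mean)
```

When all probabilities are equal the result reduces to the ordinary binomial pmf, and the mean is always the sum of the individual probabilities.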